accuracy score
- North America > United States (0.04)
- Asia > China (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (0.94)
- Health & Medicine (0.93)
- Education (0.68)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Neuchâtel > Neuchâtel (0.04)
- (2 more...)
- Leisure & Entertainment (0.47)
- Information Technology (0.46)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
End-to-End Weak Supervision Carnegie Mellon University 2
Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications, by replacing the tedious manual collection of ground truth labels. Current state of the art approaches that do not use any labeled training data, however, require two separate modeling steps: Learning a probabilistic latent variable model based on the WS sources - making assumptions that rarely hold in practice - followed by downstream model training. Importantly, the first step of modeling does not consider the performance of the downstream model. To address these caveats we propose an end-to-end approach for directly learning the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing prior probabilistic posteriors with a neural network. Our results show improved performance over prior work in terms of end model performance on downstream test sets, as well as in terms of improved robustness to dependencies among weak supervision sources.
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- (2 more...)
Demystifying the trend of the healthcare index: Is historical price a key driver?
Sadhukhan, Payel, Gupta, Samrat, Ghosh, Subhasis, Chakraborty, Tanujit
Healthcare sector indices consolidate the economic health of pharmaceutical, biotechnology, and healthcare service firms. The short-term movements in these indices are closely intertwined with capital allocation decisions affecting research and development investment, drug availability, and long-term health outcomes. This research investigates whether historical open-high-low-close (OHLC) index data contain sufficient information for predicting the directional movement of the opening index on the subsequent trading day. The problem is formulated as a supervised classification task involving a one-step-ahead rolling window. A diverse feature set is constructed, comprising original prices, volatility-based technical indicators, and a novel class of nowcasting features derived from mutual OHLC ratios. The framework is evaluated on data from healthcare indices in the U.S. and Indian markets over a five-year period spanning multiple economic phases, including the COVID-19 pandemic. The results demonstrate robust predictive performance, with accuracy exceeding 0.8 and Matthews correlation coefficients above 0.6. Notably, the proposed nowcasting features have emerged as a key determinant of the market movement. We have employed the Shapley-based explainability paradigm to further elucidate the contribution of the features: outcomes reveal the dominant role of the nowcasting features, followed by a more moderate contribution of original prices. This research offers a societal utility: the proposed features and model for short-term forecasting of healthcare indices can reduce information asymmetry and support a more stable and equitable health economy.
- North America > United States (1.00)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- (5 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Banking & Finance > Trading (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.48)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Hybrid Models for Natural Language Reasoning: The Case of Syllogistic Logic
Guzmán, Manuel Vargas, Szymanik, Jakub, Malicki, Maciej
Despite the remarkable progress in neural models, their ability to generalize, a cornerstone for applications like logical reasoning, remains a critical challenge. We delineate two fundamental aspects of this ability: compositionality, the capacity to abstract atomic logical rules underlying complex inferences, and recursiveness, the aptitude to build intricate representations through iterative application of inference rules. In the literature, these two aspects are often confounded together under the umbrella term of generalization. To sharpen this distinction, we investigated the logical generalization capabilities of pre-trained large language models (LLMs) using the syllogistic fragment as a benchmark for natural language reasoning. Though simple, this fragment provides a foundational yet expressive subset of formal logic that supports controlled evaluation of essential reasoning abilities. Our findings reveal a significant disparity: while LLMs demonstrate reasonable proficiency in recursiveness, they struggle with compositionality. To overcome these limitations and establish a reliable logical prover, we propose a hybrid architecture integrating symbolic reasoning with neural computation. This synergistic interaction enables robust and efficient inference, neural components accelerate processing, while symbolic reasoning ensures completeness. Our experiments show that high efficiency is preserved even with relatively small neural components. As part of our proposed methodology, this analysis gives a rationale and highlights the potential of hybrid models to effectively address key generalization barriers in neural reasoning systems.
- Europe > Poland > Masovia Province > Warsaw (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (2 more...)
- North America > United States (0.04)
- Asia > China (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (0.94)
- Health & Medicine (0.93)
- Education (0.68)
MatQnA: A Benchmark Dataset for Multi-modal Large Language Models in Materials Characterization and Analysis
Weng, Yonghao, Gao, Liqiang, Zhu, Linwu, Huang, Jian
Recently, large language models (LLMs) have achieved remarkable breakthroughs in general domains such as programming and writing, and have demonstrated strong potential in various scientific research scenarios. However, the capabilities of AI models in the highly specialized field of materials characterization and analysis have not yet been systematically or sufficiently validated. To address this gap, we present MatQnA, the first multi-modal benchmark dataset specifically designed for material characterization techniques. MatQnA includes ten mainstream characterization methods, such as X-ray Photoelectron Spectroscopy (XPS), X-ray Diffraction (XRD), Scanning Electron Microscopy (SEM), Transmission Electron Microscopy (TEM), etc. We employ a hybrid approach combining LLMs with human-in-the-loop validation to construct high-quality question-answer pairs, integrating both multiple-choice and subjective questions. Our preliminary evaluation results show that the most advanced multi-modal AI models (e.g., GPT-4.1, Claude 4, Gemini 2.5, and Doubao Vision Pro 32K) have already achieved nearly 90% accuracy on objective questions in materials data interpretation and analysis tasks, demonstrating strong potential for applications in materials characterization and analysis. The MatQnA dataset is publicly available at https://huggingface.co/datasets/richardhzgg/matQnA.
- Asia > China > Zhejiang Province > Hangzhou (0.05)
- North America > United States (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (2 more...)
- Health & Medicine (0.94)
- Materials (0.68)
- Education (0.67)
The Tracking Machine Learning challenge : Throughput phase
Amrouche, Sabrina, Basara, Laurent, Calafiura, Paolo, Emeliyanov, Dmitry, Estrade, Victor, Farrell, Steven, Germain, Cécile, Gligorov, Vladimir Vava, Golling, Tobias, Gorbunov, Sergey, Gray, Heather, Guyon, Isabelle, Hushchyn, Mikhail, Innocente, Vincenzo, Kiehn, Moritz, Kunze, Marcel, Moyse, Edward, Rousseau, David, Salzburger, Andreas, Ustyuzhanin, Andrey, Vlimant, Jean-Roch
This paper reports on the second "Throughput" phase of the Tracking Machine Learning (TrackML) challenge on the Codalab platform. As in the first "Accuracy" phase, the participants had to solve a difficult experimental problem linked to tracking accurately the trajectory of particles as e.g. created at the Large Hadron Collider (LHC): given O($10^5$) points, the participants had to connect them into O($10^4$) individual groups that represent the particle trajectories which are approximated helical. While in the first phase only the accuracy mattered, the goal of this second phase was a compromise between the accuracy and the speed of inference. Both were measured on the Codalab platform where the participants had to upload their software. The best three participants had solutions with good accuracy and speed an order of magnitude faster than the state of the art when the challenge was designed. Although the core algorithms were less diverse than in the first phase, a diversity of techniques have been used and are described in this paper. The performance of the algorithms are analysed in depth and lessons derived.
- Europe > Switzerland > Geneva > Geneva (0.28)
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- (6 more...)
English Pronunciation Evaluation without Complex Joint Training: LoRA Fine-tuned Speech Multimodal LLM
This study demonstrates that a Multimodal Large Language Model (MLLM) adapted via Low-Rank Adaptation (LoRA) can perform both Automatic Pronunciation Assessment (APA) and Mispronunciation Detection and Diagnosis (MDD) simultaneously. Leveraging Microsoft's Phi-4-multimodal-instruct, our fine-tuning method eliminates the need for complex architectural changes or separate training procedures conventionally required for these distinct tasks. Fine-tuned on the Speechocean762 dataset, the pronunciation evaluation scores predicted by the model exhibited a strong Pearson Correlation Coefficient (PCC > 0.7) with human-assigned scores, while achieving low Word Error Rate (WER) and Phoneme Error Rate (PER) (both < 0.15). Notably, fine-tuning only the LoRA layers was sufficient to achieve performance levels comparable to those achieved by fine-tuning all audio layers. This research highlights that an integrated pronunciation assessment system can be established by adapting large multimodal models without full fine-tuning, utilizing a significantly simpler training methodology compared to previous joint models designed for simultaneous APA and MDD. This efficient LoRA-based approach paves the way for more accessible, integrated, and effective Computer-Assisted Pronunciation Training (CAPT) technologies for English L2 learners.
Benchmarking the State of Networks with a Low-Cost Method Based on Reservoir Computing
Reimers, Felix Simon, Peters, Carl-Hendrik, Nichele, Stefano
Using data from mobile network utilization in Norway, we showcase the possibility of monitoring the state of communication and mobility networks with a non-invasive, low-cost method. This method transforms the network data into a model within the framework of reservoir computing and then measures the model's performance on proxy tasks. Experimentally, we show how the performance on these proxies relates to the state of the network. A key advantage of this approach is that it uses readily available data sets and leverages the reservoir computing framework for an inexpensive and largely agnostic method. Data from mobile network utilization is available in an anonymous, aggregated form with multiple snapshots per day. This data can be treated like a weighted network. Reservoir computing allows the use of weighted, but untrained networks as a machine learning tool. The network, initialized as a so-called echo state network (ESN), projects incoming signals into a higher dimensional space, on which a single trained layer operates. This consumes less energy than deep neural networks in which every weight of the network is trained. We use neuroscience inspired tasks and trained our ESN model to solve them. We then show how the performance depends on certain network configurations and also how it visibly decreases when perturbing the network. While this work serves as proof of concept, we believe it can be elevated to be used for near-real-time monitoring as well as the identification of possible weak spots of both mobile communication networks as well as transportation networks.
- Europe > Norway > Eastern Norway > Oslo (0.05)
- Europe > Switzerland > Lucerne > Lucerne (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
- Telecommunications (1.00)
- Information Technology > Networks (1.00)